Using Category-Based Adherence to Cluster Market-Basket Data

Authors

  • Ching-Huang Yun
  • Kun-Ta Chuang
  • Ming-Syan Chen
Abstract

In this paper, we devise an efficient algorithm for clustering market-basket data. Unlike traditional data, market-basket data are known to be high-dimensional and sparse, and to contain massive outliers. Clustering transactions across different levels of the taxonomy is of great importance for marketing strategies as well as for representing the results of clustering techniques for market-basket data. In view of these features, we devise a novel measurement, called the category-based adherence, and use it to perform the clustering. The distance of an item to a given cluster is defined as the number of links between this item and its nearest large node in the taxonomy tree, where a large node is an item (i.e., leaf) or a category (i.e., internal) node whose occurrence count exceeds a given threshold. The category-based adherence of a transaction to a cluster is then defined as the average distance of the items in this transaction to that cluster. With this measurement, we develop an efficient clustering algorithm, called algorithm CBA (standing for Category-Based Adherence), for market-basket data, with the objective of minimizing the category-based adherence. A validation model based on Information Gain (IG) is also devised to assess the quality of clustering for market-basket data. Experimental results on both real and synthetic datasets show that, with the taxonomy information, algorithm CBA significantly outperforms prior works in both execution efficiency and clustering quality for market-basket data.
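The two definitions in the abstract (item-to-cluster distance and the category-based adherence of a transaction) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy taxonomy and all function names are hypothetical, and several details that the abstract does not specify are assumptions: the nearest large node is searched only upward along the item's path to the root, a category's occurrence count is the number of transactions in the cluster containing at least one of its descendant items, and an item with no large node on its path receives a penalty equal to the length of that path.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical toy taxonomy, stored as child -> parent (the root maps to None).
PARENT: Dict[str, Optional[str]] = {
    "food": None,
    "dairy": "food",
    "bakery": "food",
    "milk": "dairy",
    "cheese": "dairy",
    "bread": "bakery",
}


def path_to_root(node: str) -> List[str]:
    """Return the node followed by all of its ancestors, ending at the root."""
    path: List[str] = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = PARENT[cur]
    return path


def large_nodes(cluster: Iterable[Set[str]], threshold: int) -> Set[str]:
    """Nodes (items or categories) whose occurrence count exceeds the threshold.

    Assumption: a transaction counts once for each node that lies on the
    root path of at least one of its items.
    """
    counts: Dict[str, int] = {}
    for transaction in cluster:
        touched: Set[str] = set()
        for item in transaction:
            touched.update(path_to_root(item))
        for node in touched:
            counts[node] = counts.get(node, 0) + 1
    return {node for node, count in counts.items() if count > threshold}


def item_distance(item: str, large: Set[str]) -> int:
    """Number of links from the item up to its nearest large node."""
    path = path_to_root(item)
    for links, node in enumerate(path):
        if node in large:
            return links
    # Assumption: no large node on the path, so use the path length as a penalty.
    return len(path)


def adherence(transaction: Set[str], large: Set[str]) -> float:
    """Average distance of the transaction's items to the cluster."""
    if not transaction:
        raise ValueError("transaction must contain at least one item")
    return sum(item_distance(i, large) for i in transaction) / len(transaction)


cluster = [{"milk", "bread"}, {"milk", "cheese"}, {"milk"}]
large = large_nodes(cluster, threshold=1)
score = adherence({"milk", "bread"}, large)
```

In this toy cluster, "milk", "dairy", and "food" each occur in more than one transaction and are therefore large. "milk" is at distance 0, "cheese" at distance 1 (via "dairy"), and "bread" at distance 2 (via "food"), so the transaction {"milk", "bread"} has an adherence of 1.0. A lower adherence means the transaction fits the cluster better, which is consistent with the stated objective of minimizing the category-based adherence.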

Similar resources

A Combined Approach for Segment-Specific Analysis of Market Basket Data

There are two main research traditions for analyzing market basket data that exist more or less independently from each other, namely exploratory and explanatory model types. Exploratory approaches are restricted to the task of discovering cross-category interrelationships and provide marketing managers with only very limited recommendations regarding decision making. The latter type of models ...

Market Basket Analysis Visualization On A Spherical Surface

This paper discusses the visualization of the relationships in e-commerce transactions. To date, many practical research projects have shown the usefulness of a physics-based mass-spring technique to lay out data items with close relationships on a graph. We describe a market basket analysis visualization system (MAV) using this technique. This system is described as follows: (1) integrate...

An Efficient Clustering Algorithm for Market Basket Data Based on Small Large Ratios

In this paper, we devise an efficient algorithm for clustering market-basket data items. In view of the nature of clustering market basket data, we devise in this paper a novel measurement, called the small-large (abbreviated as SL) ratio, and utilize this ratio to perform the clustering. With this SL ratio measurement, we develop an efficient clustering algorithm for data items to minimize the...

A Dynamic Analysis of Market Efficiency on Benchmark Crude oil markets: Based on the Adaptive Market Hypothesis

This paper examines the applicability of the adaptive market hypothesis (AMH) as an evolutionary alternative to the efficient market hypothesis (EMH) by studying daily returns on the three benchmark crude oils. The data coverage of daily returns is from January 2nd, 2003 to March 5th, 2018. In this paper, two different tests in the form of two distinct classes (linear and nonlinear) have bee...

Associated Map and Inter-Purchase Time Model for Multiple-Category Products

The continued rise of e-commerce is the main driver of the rapid growth of global online purchasing. Consumers can buy nearly everything they want on one occasion through online shopping. Purchase behavior models that focus on a single product category are insufficient to describe online shopping behavior. Therefore, the analysis of multi-category purchases is becoming more and more popular. For example, ...

Publication date: 2002